Lecture 06: Hypothesis (H) testing and simple tests II

Bill Perry

Brief review

  • H test for a single population
  • 1- and 2-sided tests
  • H test for two populations
  • Assumptions of parametric tests

Lecture 6 overview

  • Assumptions of parametric tests
  • Statistical vs. biological significance
  • Robust tests
  • Rank-based tests
  • Permutation tests
  • Assignment 1

Lake Trout

Grayling

Assumptions of parametric tests

  • T-tests are parametric tests

  • Parametric tests: specify/assume the probability distribution of the population from which the samples came

  • Non-parametric tests: no assumption about probability distribution

  • Mukasa et al 2021 DOI: 10.4236/ojbm.2021.93081

Assumptions of parametric tests

  • If assumptions of a parametric test are violated, the test becomes unreliable
  • This is because the test statistic may no longer follow its assumed distribution (e.g., the t distribution)
  • Most parametric tests are robust to mild/moderate violations of the assumptions below

Assumptions of parametric tests

  • Basic assumptions of parametric t-tests:
  • Normality, equal variance, random sampling, no outliers
  • Normality: Samples from normally distributed population
    • Graphical tests: histograms, dotplots, boxplots, qq-plots
    • “Formal” tests: Shapiro-Wilk test
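
A minimal R sketch of these normality checks, assuming simulated data that are not from the lecture (the vectors `y_normal` and `y_skewed` are hypothetical). `shapiro.test()` is part of base R's stats package; its null hypothesis is that the sample comes from a normally distributed population.

```r
# Simulated data for illustration only (not from the lecture)
set.seed(1)
y_normal <- rnorm(40, mean = 5, sd = 2)  # drawn from a normal population
y_skewed <- rexp(40, rate = 1)           # drawn from a right-skewed population

# Shapiro-Wilk test: H0 is that the sample comes from a normal population,
# so a small p-value is evidence AGAINST normality
sw_normal <- shapiro.test(y_normal)
sw_skewed <- shapiro.test(y_skewed)
sw_normal$p.value
sw_skewed$p.value

# Graphical checks (drawn only in an interactive session)
if (interactive()) {
  hist(y_skewed)
  boxplot(y_skewed)
  qqnorm(y_skewed); qqline(y_skewed)
}
```

With small samples the formal test has little power, so it is usually read alongside the plots rather than instead of them.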

Assumptions of parametric tests

  • Equal variance: samples are from populations with similar degree of variability
    • Graphical tests: boxplots
    • “Formal” tests: F-ratio test
  • Parametric tests are most robust to violations of the normality and equal variance assumptions when sample sizes are equal
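
A short R sketch of the F-ratio test, assuming two simulated samples (`y1` and `y2` are hypothetical, not lecture data). `var.test()` in base R compares two variances; its null hypothesis is that the ratio of the population variances is 1.

```r
# Simulated samples with deliberately different spread (illustration only)
set.seed(2)
y1 <- rnorm(20, mean = 10, sd = 2)
y2 <- rnorm(20, mean = 10, sd = 6)

# F-ratio test: the statistic is the ratio of the two sample variances
f_res <- var.test(y1, y2)
f_res$statistic   # F = var(y1) / var(y2)
f_res$parameter   # numerator and denominator df (n1 - 1, n2 - 1)
f_res$p.value

# Graphical check (drawn only in an interactive session)
if (interactive()) boxplot(list(y1 = y1, y2 = y2))
```

Note that the F-ratio test itself assumes normality, so it is best treated as one piece of evidence next to the boxplots.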

Assumptions of parametric tests

  • Normality, equal variance, random sampling, no outliers
  • Random sampling: samples are randomly collected from populations; part of experimental design
  • Necessary for sample -> population inference

Assumptions of parametric tests

  • Normality, equal variance, random sampling, no outliers
  • No outliers: no “extreme” values that are very different from rest of sample
  • Graphical tests: boxplots, histograms
  • “Formal” tests: Grubbs’ test
  • Note: outliers also problem for non-parametric tests
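
A small R sketch of outlier screening. Grubbs' test is not part of base R, so this example swaps in the boxplot rule (values more than 1.5 times the hinge spread beyond the hinges), which is what `boxplot()` draws as isolated points. The data vector `y` is made up for illustration.

```r
# Hypothetical sample; the last value is suspiciously large
y <- c(4.8, 5.1, 5.3, 4.9, 5.0, 5.2, 4.7, 5.1, 12.4)

# boxplot.stats() returns the values that fall outside the whiskers
flagged <- boxplot.stats(y)$out
flagged

# The same points appear as isolated dots on the boxplot
if (interactive()) boxplot(y)
```

Flagging a value this way only identifies it for a closer look; whether to keep or remove it is a judgment about the data, not something the rule decides.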

Statistical vs. biological significance

  • Statistical significance: difference unlikely due to chance
  • Says nothing about biological significance of difference!
  • With large sample size can detect very small differences between populations
  • E.g.: consider 2 snail populations, A and B, and compare their mean sizes:
    • Ho: µA = µB
    • Ha: µA ≠ µB

Statistical vs. biological significance

  • Size of A: 5.05 (± 2.00 SD)mm, size of B: 5.00 (± 2.00 SD)mm
  • Sample 50, 200, 30,000 individuals from each pop:
    • n = 50: t = 0.32, df = 98, p-value = 0.75
    • n = 200: t = 0.058, df = 398, p-value = 0.95
    • n = 30,000: t = -4.47, df = 59998, p-value = 7.996 × 10^-6
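
The effect of sample size can be explored with a simulation along these lines. This is a sketch, assuming normally distributed sizes with the population means and SD given above; the seed and the random draws are arbitrary, so the numbers it prints will not match the slide.

```r
# Simulate two populations that differ by only 0.05 mm in mean size
set.seed(42)
sizes <- c(50, 200, 30000)

results <- do.call(rbind, lapply(sizes, function(n) {
  a <- rnorm(n, mean = 5.05, sd = 2)   # population A
  b <- rnorm(n, mean = 5.00, sd = 2)   # population B
  tt <- t.test(a, b, var.equal = TRUE) # classic two-sample t-test
  data.frame(n    = n,
             diff = mean(a) - mean(b),
             t    = unname(tt$statistic),
             df   = unname(tt$parameter),
             p    = tt$p.value)
}))
results
```

The standard error of the difference shrinks with the square root of n, so a fixed, tiny difference eventually produces a large t statistic.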

Statistical vs. biological significance

  • Finally, statistically significant difference…
  • Meaningful? Ecologically significant? Statistics can’t answer this question
  • IMPORTANT to report info that can assess biological significance
  • “A two-tailed, two-sample independent t-test showed a significant difference in size between pop. A (4.99 mm ± 1.99 SD) and pop. B (5.06 mm ± 1.99 SD) at α = 0.05 (t = -4.47, df = 59998, p-value < 0.0001).”

Assumptions of parametric tests

  • Basic assumptions of parametric t-tests:
  • Normality, equal variance, random sampling, no outliers
  • What to do if assumptions are violated?

Homework take-up

  • t-tests have several assumptions. Alternative tests, with more relaxed assumptions, are available to statisticians. In which case would you use the following tests?
    • Welch’s t-test: when distribution normal but variance unequal
    • Permutation test for two samples: when distribution not normal (but both groups should still have similar distributions and ~equal variance)
    • Mann-Whitney-Wilcoxon test: when distribution not normal and/or outliers are present (but both groups should still have similar distributions and ~equal variance)

Assumptions of parametric tests

  • QQ-plots: tool for assessing normality
  • x-axis: theoretical quantiles from the standard normal distribution (SND)
  • y-axis: ordered sample values
  • Departure from normality shows up as deviation from a straight line
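
To make the two axes concrete, here is a sketch that builds the QQ-plot coordinates by hand and checks them against `qqnorm()`. The sample `y` is simulated for illustration; `ppoints()` supplies the probabilities at which the standard normal quantiles are evaluated.

```r
# Simulated sample (illustration only)
set.seed(3)
y <- rnorm(30, mean = 10, sd = 3)
n <- length(y)

theo <- qnorm(ppoints(n))  # x-axis: theoretical quantiles from the SND
samp <- sort(y)            # y-axis: ordered sample values

# qqnorm() computes the same coordinates (returned in the original data order)
qq <- qqnorm(y, plot.it = FALSE)
all.equal(sort(qq$x), theo)
all.equal(sort(qq$y), samp)

# Draw the plot with a reference line (interactive sessions only)
if (interactive()) {
  qqnorm(y); qqline(y)
}
```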

Assumptions of parametric tests

  • In some cases, data can be mathematically “transformed” to meet assumptions of parametric tests
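
One common case is a log transformation of right-skewed, strictly positive data. The sketch below assumes simulated log-normal data (`y` is hypothetical); the lecture does not say which transformation to use, so this is only one possibility.

```r
# Simulated right-skewed, strictly positive data (illustration only)
set.seed(4)
y <- rlnorm(50, meanlog = 1, sdlog = 0.8)

# Natural-log transformation; only defined for positive values
y_log <- log(y)

# Compare normality checks before and after transforming
shapiro.test(y)$p.value
shapiro.test(y_log)$p.value

if (interactive()) {
  qqnorm(y);     qqline(y)
  qqnorm(y_log); qqline(y_log)
}
```

A test run on `y_log` is a test about the mean of the logged values, so results should be reported on that scale or back-transformed with care.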

Robust tests

  • Welch’s t-test: common “robust” test for means of two populations
  • Robust to violation of equal variance assumption, deals better with unequal sample size
  • Parametric test (assumes normal distribution)
  • Calculates a t statistic but recalculates df based on sample sizes and sample standard deviations (s)
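
The df adjustment can be sketched by hand using the Welch-Satterthwaite approximation and compared with what `t.test()` reports. The samples `y1` and `y2` are simulated with unequal sizes and variances purely for illustration.

```r
# Simulated samples with unequal n and unequal spread (illustration only)
set.seed(5)
y1 <- rnorm(20, mean = 10, sd = 2)
y2 <- rnorm(12, mean = 12, sd = 6)
n1 <- length(y1); n2 <- length(y2)

# Squared standard error of each sample mean: s^2 / n
v1 <- var(y1) / n1
v2 <- var(y2) / n2

# Welch t statistic and Welch-Satterthwaite degrees of freedom
t_welch  <- (mean(y1) - mean(y2)) / sqrt(v1 + v2)
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# Same quantities from R's built-in Welch test
res <- t.test(y1, y2, var.equal = FALSE)
c(manual_t = t_welch, manual_df = df_welch)
c(r_t = unname(res$statistic), r_df = unname(res$parameter))
```

The Welch df is generally not a whole number and never exceeds n1 + n2 - 2, the df of the equal-variance t-test.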

Robust tests

  • In R:
  • t.test(y1, y2, var.equal = FALSE, paired = FALSE)
  • will use the Welch approach
  • Student’s t-test:
    • A vs. B: df = 38, t = -3.62, p = 0.0009
    • A vs. C: df = 38, t = -2.91, p = 0.005
  • Welch’s t-test:
    • A vs. B: df = 37.9, t = -3.62, p = 0.0009
    • A vs. C: df = 26.1, t = -2.91, p = 0.007

Rank based tests

  • Rank-based tests: no assumptions about distribution (non-parametric)
  • Ranks of data: observations assigned ranks, sums (and signs for paired tests) of ranks for groups compared
  • Mann-Whitney U test common alternative to independent samples t-test
  • Wilcoxon signed-rank test is alternative to paired t-test

Rank based tests

  • Assumptions: similar distributions for groups, equal variance
  • Less power than parametric tests when the parametric assumptions are met
  • Best when normality assumption cannot be met by transformation (weird distribution) or when there are large outliers

  • Sample A: n = 15, ȳ = 8, s = 4
  • Sample B: n = 15, ȳ = 10, s = 5
  • A vs. B:
    • t-test: df = 28, t = -3.53, p = 0.0014
    • Mann-Whitney U (Wilcoxon): W = 41, p = 0.002
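
In R, the Mann-Whitney U test is run with `wilcox.test()`. The sketch below uses simulated skewed data (the vectors `a` and `b` are hypothetical and are not the samples summarized above) and reconstructs the W statistic from the ranks to show what is being compared.

```r
# Simulated right-skewed samples (illustration only)
set.seed(6)
a <- rexp(15, rate = 1 / 8)
b <- rexp(15, rate = 1 / 10)

# Mann-Whitney U / Wilcoxon rank-sum test for two independent samples
mw <- wilcox.test(a, b)
mw$statistic
mw$p.value

# W by hand: rank the pooled data, sum the ranks of sample a,
# then subtract the smallest possible rank sum for a
r <- rank(c(a, b))
W_manual <- sum(r[seq_along(a)]) - length(a) * (length(a) + 1) / 2
W_manual

# For paired data, the Wilcoxon signed-rank test is
# wilcox.test(x, y, paired = TRUE)
```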

Permutation tests

  • Permutation tests based on resampling: reshuffling of original data
  • Resampling allows parameter estimation when distribution unknown, including SEs and CIs of statistics (means, medians)
  • Common approach is bootstrap: resample sample with replacement many times, recalculate sample stats
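
A minimal bootstrap sketch in base R, assuming a simulated skewed sample (`y` is hypothetical). It estimates the SE and a percentile CI for the median; the percentile interval is only one of several bootstrap CI methods, chosen here for simplicity.

```r
# Simulated skewed sample (illustration only)
set.seed(7)
y <- rexp(35, rate = 1 / 4)

# Resample the sample WITH replacement many times,
# recalculating the median each time
n_boot <- 2000
boot_medians <- replicate(n_boot, median(sample(y, replace = TRUE)))

se_median <- sd(boot_medians)                         # bootstrap SE
ci_median <- quantile(boot_medians, c(0.025, 0.975))  # percentile 95% CI
se_median
ci_median
```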

Permutation tests

  • Sample A: n = 40, ȳ= 1.72, s = 4.17
  • Sample B: n = 35, ȳ= 4.50, s = 4.83
  • Ho: µA = µB, Ha: µA ≠µB
  • Calculate ∆ in means between two groups (2.78)

Permutation tests

  • Randomly reshuffle observations between groups (keeping nA=40 and nB=35), calculate ∆
  • Repeat >1,000 times
  • Record the proportion of reshuffled ∆ values that are ≥ the observed ∆ (2.78 µmol, as calculated above)
  • This is equivalent to p-value and can be used in “traditional” H test framework

Permutation tests

  • In R (using ‘perm’ package):
  • permTS(y1, y2, alternative = "two.sided", method = "exact.mc", control = permControl(nmc = 10000))
  • Assumptions: both groups have similar distribution; equal variance
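
The reshuffling procedure can also be sketched in base R without the ‘perm’ package. The data here are simulated from normal distributions using the sample sizes, means, and SDs quoted earlier, so `y1` and `y2` are stand-ins rather than the original observations, and the number of reshuffles (`n_perm`) is an arbitrary choice.

```r
# Stand-in data simulated from the summary statistics (not the real samples)
set.seed(8)
y1 <- rnorm(40, mean = 1.72, sd = 4.17)
y2 <- rnorm(35, mean = 4.50, sd = 4.83)

obs_delta <- mean(y2) - mean(y1)  # observed difference in means
pooled <- c(y1, y2)
n1 <- length(y1)

# Reshuffle observations between groups, keeping group sizes fixed,
# and recalculate the difference in means each time
n_perm <- 5000
perm_delta <- replicate(n_perm, {
  idx <- sample(length(pooled), n1)        # indices assigned to "group A"
  mean(pooled[-idx]) - mean(pooled[idx])
})

# Two-sided p-value: proportion of reshuffled differences at least
# as extreme as the observed one
p_perm <- mean(abs(perm_delta) >= abs(obs_delta))
p_perm

if (interactive()) {
  hist(perm_delta); abline(v = c(-obs_delta, obs_delta), lty = 2)
}
```

Taking absolute values makes the test two-sided, matching Ha: µA ≠ µB; dropping them and using a one-directional comparison would give a one-sided test.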

R practice

  • Get practice doing basic t-tests
  • Alternatives in next lecture
  • Dataset (squirrel_data.csv) and lab instructions on Canvas
  • Answer questions in bold
  • Due end of Thursday